Skip to content

Conversation

@chatton
Copy link
Contributor

@chatton chatton commented Jan 28, 2026

Overview

This PR adds a compatibility test that will deploy a specified image, execute some transactions, then deploy the PR image and verify that syncing still works correctly.

closes #3019

Add EVMCompatTestSuite to test syncing between different ev-node
versions. The test:
- Starts sequencer with version A (env: SEQUENCER_EVM_IMAGE_*)
- Submits transactions
- Starts full node with version B (env: FULLNODE_EVM_IMAGE_*)
- Verifies full node syncs and can query transactions
- Submits post-sync transactions to verify ongoing sync

Add make test-docker-compat target for running the test.
@claude
Copy link
Contributor

claude bot commented Jan 28, 2026

Claude finished @chatton's task in 1m 51s —— View job


Comprehensive Code Review

I've reviewed the compatibility test suite implementation and related changes. Here's my detailed feedback organized by key areas:


1. Code Quality & Architecture ✅

Strengths:

  • Clean separation of concerns with EVMCompatTestSuite embedding DockerTestSuite for infrastructure reuse
  • Well-structured test flow with logical sub-runs that clearly delineate test phases
  • Excellent refactoring: extracting EVM test constants (lines 42-46 in docker_test.go) into shared package scope promotes DRY
  • Good builder pattern usage throughout (e.g., evmsingle.NewNodeConfigBuilder())
  • Consistent naming conventions and idiomatic Go

Observations:

  • The RethSetupConfig rename provides better clarity than the old name
  • New helper functions SetupEVMSingle() and related types improve reusability across test suites

2. Error Handling & Resilience ⚠️

Issues Found:

Critical - Fixed Sleep Anti-Pattern (compat_test.go:68)

// wait for blocks to be posted to DA before starting full node
time.Sleep(5 * time.Second)

This is already noted by gemini-code-assist but worth emphasizing:

  • Problem: Fixed sleep introduces test flakiness - DA posting time varies based on system load, network conditions, and CI environment
  • Impact: Tests may fail intermittently on slower CI runners or pass when they shouldn't on fast machines
  • Recommendation: Replace with a conditional wait that polls for actual DA state or waits for additional blocks

Suggested Fix:

s.T().Log("Waiting for DA propagation...")
lastHeight, err := s.sequencerClient.BlockNumber(ctx)
s.Require().NoError(err)
targetHeight := lastHeight + 3 // wait for 3 more blocks (3s at 1s block time)
s.Require().Eventually(func() bool {
    h, err := s.sequencerClient.BlockNumber(ctx)
    return err == nil && h >= targetHeight
}, 15*time.Second, time.Second, "sequencer did not produce blocks for DA propagation")

Fix the sleep issue →

Good Patterns:

  • Proper use of s.Require().Eventually() in waitForSync() (line 205-208)
  • Good timeout values throughout (30s-120s) that balance test speed vs. reliability
  • Proper error wrapping with context in assertions

3. Security 🔒

Observations:

  • Hardcoded test private key (evmTestPrivateKey) is appropriately scoped to test-only code with build tags
  • No credential exposure in logs
  • JWT secrets properly generated and passed through configuration
  • Good use of Docker network isolation

Minor Note:

  • The passphrase "secret" (line 103, 104 in compat_test.go) is fine for tests but worth calling out that this is test-only

4. Performance & Resource Efficiency ⚡

Strengths:

  • Appropriate use of goroutine-safe Eventually helpers
  • Good connection reuse patterns with ethclient.Client
  • Proper cleanup considerations with Docker volumes

Concerns:

Resource Cleanup

  • I don't see explicit cleanup/teardown for the Docker containers in EVMCompatTestSuite
  • The DockerTestSuite doesn't show a TearDownSuite() method
  • Question: Are containers cleaned up automatically by the test framework, or should there be explicit cleanup?

Test Duration

  • Running 50 transactions sequentially (line 64) with individual wait for each (line 198) could be slow
  • Recommendation: Consider batching transaction submissions or parallelizing waits if test duration becomes an issue

5. Testing & Reliability 🧪

Strengths:

  • Excellent test coverage of the sync compatibility scenario
  • Good verification strategy: comparing sequencer vs full node receipts and block numbers
  • Tests both pre-sync and post-sync transaction propagation
  • Clear test organization with sub-runs

Issues:

Test Robustness (compat_test.go:213-224)

func (s *EVMCompatTestSuite) verifyTransactionsOnFullNode(ctx context.Context, txHashes []common.Hash) {
    for i, txHash := range txHashes {
        seqReceipt, err := s.sequencerClient.TransactionReceipt(ctx, txHash)
        s.Require().NoError(err, "failed to query tx %d on sequencer: %s", i, txHash.Hex())

        fnReceipt, err := s.fullNodeClient.TransactionReceipt(ctx, txHash)
        s.Require().NoError(err, "failed to query tx %d on full node: %s", i, txHash.Hex())
  • Concern: No retry logic if full node hasn't fully synced individual transactions yet
  • The waitForSync() only checks block height, not transaction availability
  • Recommendation: Add a small Eventually wrapper around the full node transaction receipt query

Suggested Enhancement:

func (s *EVMCompatTestSuite) verifyTransactionsOnFullNode(ctx context.Context, txHashes []common.Hash) {
    for i, txHash := range txHashes {
        seqReceipt, err := s.sequencerClient.TransactionReceipt(ctx, txHash)
        s.Require().NoError(err, "failed to query tx %d on sequencer: %s", i, txHash.Hex())

        var fnReceipt *types.Receipt
        s.Require().Eventually(func() bool {
            var err error
            fnReceipt, err = s.fullNodeClient.TransactionReceipt(ctx, txHash)
            return err == nil
        }, 10*time.Second, 500*time.Millisecond, "failed to query tx %d on full node: %s", i, txHash.Hex())

        s.Require().Equal(uint64(1), fnReceipt.Status, "tx %d should have successful status", i)
        // ... rest of verification
    }
}

6. Documentation & Developer Experience 📚

Strengths:

  • Good function-level comments describing purpose
  • Clear variable naming
  • Helpful test logging at key points

Improvements Needed:

Missing Documentation:

  1. Package-level documentation: compat_test.go lacks a package comment explaining the test suite's purpose and scope

  2. Configuration documentation (compat_test.go:228-250):

func getSequencerImage() container.Image {
    repo := strings.TrimSpace(os.Getenv("SEQUENCER_EVM_IMAGE_REPO"))
    if repo == "" {
        repo = "ghcr.io/evstack/ev-node-evm"
    }
    // ...
}
  • Missing: Comments documenting the environment variables and their purpose
  • Missing: Example usage in CI/local development

Suggested Addition:

// getSequencerImage returns the Docker image for the sequencer node.
// Controlled by environment variables:
//   - SEQUENCER_EVM_IMAGE_REPO: Docker repository (default: ghcr.io/evstack/ev-node-evm)
//   - SEQUENCER_EVM_IMAGE_TAG: Image tag (default: main)
// Example: SEQUENCER_EVM_IMAGE_TAG=v1.0.0 make test-docker-compat
  1. Workflow TODO (.github/workflows/docker-tests.yml:68):
SEQUENCER_EVM_IMAGE_TAG: main # TODO: set this to the last released tag
  • This TODO should be addressed before merging or converted to a GitHub issue

7. Additional Findings

Go Version Update

  • Multiple go.mod files updated from Go 1.25.0 → 1.25.6 and dependency updates
  • ✅ Good practice to stay current with patch releases
  • Ensure CI is using Go 1.25.6 as well

Dependency Updates

  • New dependencies like github.com/libp2p/go-libp2p/core added
  • Several version bumps (fatih/color, google/flatbuffers, etc.)
  • ⚠️ No visible breaking changes, but ensure these have been tested

Test Organization

  • New make target test-docker-compat properly added to scripts/test.mk
  • Build tags properly used (//go:build docker_e2e && evm)
  • ✅ Good integration with existing test infrastructure

Summary & Recommendations

Must Fix Before Merge:

  1. 🔴 Replace time.Sleep(5 * time.Second) at compat_test.go:68 with conditional wait
  2. 🟡 Add retry logic to verifyTransactionsOnFullNode() for robustness
  3. 🟡 Address or document the TODO in .github/workflows/docker-tests.yml:68

Should Consider:
4. 🟢 Add documentation for environment variables in image getter functions
5. 🟢 Add package-level documentation to compat_test.go
6. 🟢 Verify Docker container cleanup strategy
7. 🟢 Consider adding a test case for sync failure scenarios

Excellent Work:

  • Strong test architecture and reusability improvements
  • Clear test flow and good use of sub-tests
  • Proper verification of cross-version compatibility
  • Good refactoring of test utilities

Overall Assessment

This PR adds valuable cross-version compatibility testing infrastructure. The code quality is generally high with good Go practices. The main concern is the fixed sleep anti-pattern which could cause test flakiness. With the recommended fixes, this will be a robust addition to the test suite.

Approval Status: Recommend addressing the fixed sleep issue (#1) before merge. Items #2-3 are important but could be addressed in follow-up PRs if needed.


@github-actions
Copy link
Contributor

github-actions bot commented Jan 28, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedFeb 2, 2026, 4:37 PM

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @chatton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the testing infrastructure by introducing a robust Docker E2E compatibility test. This new test ensures that different versions of the ev-node-evm can seamlessly synchronize, specifically verifying that a newer full node can correctly process and validate data from an older sequencer. The changes also include refactoring of test utilities to promote cleaner, more maintainable code and updating core dependencies to their latest versions, contributing to overall system stability and future compatibility.

Highlights

  • New Compatibility Test Suite: A new Docker E2E test suite, EVMCompatTestSuite, has been introduced to verify cross-version syncing capabilities of the ev-node-evm. This test deploys an older sequencer image, executes transactions, and then deploys the current PR image as a full node to ensure it can correctly sync and validate the historical transactions.
  • Docker E2E Test Utilities Refactoring: Existing Docker E2E test utilities have been refactored for improved reusability and clarity. This includes defining shared EVM test constants, renaming RethSetup to RethSetupConfig, and introducing EVMSingleSetupConfig and SetupEVMSingle helper functions to streamline node setup.
  • Dependency and Go Version Updates: The Go version has been updated from 1.25.0 to 1.25.6, and several Go module dependencies, including github.com/fatih/color, github.com/google/flatbuffers, github.com/hashicorp/go-hclog, and github.com/celestiaorg/tastora, have been updated to newer versions. New dependencies like github.com/libp2p/go-libp2p/core were also added.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/docker-tests.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable compatibility test suite to ensure syncing works across different node versions. The new EVMCompatTestSuite effectively simulates a real-world scenario where a new full node syncs from a sequencer running a different version. The changes also include beneficial refactoring of the existing test suite, such as centralizing test constants and creating reusable helper functions for setting up test nodes. My review includes one suggestion to improve the robustness of the new test by replacing a fixed sleep with a conditional wait.

Comment on lines +68 to +71
time.Sleep(5 * time.Second)

sequencerHeight, err := s.sequencerClient.BlockNumber(ctx)
s.Require().NoError(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a fixed time.Sleep can introduce flakiness into tests, as the time required for an operation can vary between test runs. A more robust approach is to wait for a specific condition. Here, you're waiting for blocks to be posted to the DA. While directly querying the DA might be complex, you can achieve a more reliable wait by ensuring a few more blocks are produced by the sequencer. This provides a time window for DA propagation that is tied to block production rather than a fixed duration.

s.T().Log("Waiting for a few blocks to allow for DA propagation...")
	lastTxHeight, err := s.sequencerClient.BlockNumber(ctx)
	s.Require().NoError(err)
	const daPropagationBlocks = 3
	targetHeight := lastTxHeight + daPropagationBlocks
	s.Require().Eventually(func() bool {
		h, err := s.sequencerClient.BlockNumber(ctx)
		return err == nil && h >= targetHeight
	}, 15*time.Second, time.Second, "sequencer did not produce new blocks for DA propagation")

	sequencerHeight, err := s.sequencerClient.BlockNumber(ctx)
	s.Require().NoError(err)

@codecov
Copy link

codecov bot commented Jan 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.01%. Comparing base (0189cd5) to head (92657a1).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3023      +/-   ##
==========================================
+ Coverage   55.95%   56.01%   +0.05%     
==========================================
  Files         117      117              
  Lines       11843    11843              
==========================================
+ Hits         6627     6634       +7     
+ Misses       4486     4479       -7     
  Partials      730      730              
Flag Coverage Δ
combined 56.01% <ø> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@chatton chatton marked this pull request as ready for review January 29, 2026 14:54
@julienrbrt julienrbrt added this pull request to the merge queue Feb 1, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 1, 2026
@julienrbrt julienrbrt enabled auto-merge February 2, 2026 16:37
@julienrbrt julienrbrt added this pull request to the merge queue Feb 2, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 2, 2026
@julienrbrt julienrbrt added this pull request to the merge queue Feb 2, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[E2E] Sync Compatibility Test

3 participants